Transmission of drug-resistant bacteria in a hospital-community model stratified by patient risk

A susceptible-infectious-susceptible (SIS) model for simulating healthcare-acquired infection spread within a hospital and associated community is proposed. The model accounts for the stratification of in-patients into two susceptibility-based risk groups. The model is formulated as a system of first-order ordinary differential equations (ODEs) with appropriate initial conditions. The mathematical analysis of this system is demonstrated. It is shown that the system has unique global solutions, which are bounded and non-negative. The basic reproduction number (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathscr {R}_0$$\end{document}R0) for the considered model is derived. The existence and the stability of the stationary solutions are analysed. The disease-free stationary solution is always present and is globally asymptotically stable for \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathscr {R}_0<1$$\end{document}R0<1, while for \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathscr {R}_0>1$$\end{document}R0>1 it is unstable. The presence of an endemic stationary solution depends on the model parameters and when it exists, it is globally asymptotically stable. The endemic state encompasses both risk groups. The endemic state within only one group only is not possible. In addition, for \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathscr {R}_0=1$$\end{document}R0=1 a forward bifurcation takes place. Numerical simulations, based on the anonymised insurance data, are also presented to illustrate theoretical results.


Transmission of drug-resistant bacteria in a hospital-community model stratified by patient risk
Paweł Brachaczek 1 , Agata Lonc 1 , Mirjam E. Kretzschmar 2 , Rafael Mikolajczyk 3 , Johannes Horn 3 , Andre Karch 4 , Konrad Sakowski 5* & Monika J. Piotrowska 5 A susceptible-infectious-susceptible (SIS) model for simulating healthcare-acquired infection spread within a hospital and associated community is proposed.The model accounts for the stratification of in-patients into two susceptibility-based risk groups.The model is formulated as a system of firstorder ordinary differential equations (ODEs) with appropriate initial conditions.The mathematical analysis of this system is demonstrated.It is shown that the system has unique global solutions, which are bounded and non-negative.The basic reproduction number ( R 0 ) for the considered model is derived.The existence and the stability of the stationary solutions are analysed.The disease-free stationary solution is always present and is globally asymptotically stable for R 0 < 1 , while for R 0 > 1 it is unstable.The presence of an endemic stationary solution depends on the model parameters and when it exists, it is globally asymptotically stable.The endemic state encompasses both risk groups.The endemic state within only one group only is not possible.In addition, for R 0 = 1 a forward bifurcation takes place.Numerical simulations, based on the anonymised insurance data, are also presented to illustrate theoretical results.
An important step paving the road for modern medicine was the discovery of antibiotics.Unfortunately, due to evolutionary processes, the susceptibility of microorganisms to antibiotics diminishes with time.Widespread use of antibiotics intensifies this process and leads to the emergence of antibiotic-resistant bacteria 1-3 and to the need for the investigation of the impact of antibiotic use on mortality 4 .Multiresistant pathogens are often spread within hospital networks by transfers or readmissions of colonised patients from the community [5][6][7] .In recent years, multiple modelling studies were conducted to assess the extent of the problem and introduce potential interventions [8][9][10] .Modelling studies were most often based on admission and discharge data, which also might have contained information on diagnoses of the patients [11][12][13][14][15][16][17][18][19] .Nevertheless, these previous research generally did not consider differences between individual patients but rather focused only on patient streams between institutions.In consequence, all patients were considered the same.
On the other hand, the considered model belongs to a family of so-called compartmental patch SIS models [20][21][22] or multi-group SIS models [23][24][25] .In contrast to [22][23][24][25] , where incidence is modelled by bilinear term, we assume non-linear dependence.Moreover, in modelling our problem, we cannot assume that the connectivity matrix is symmetric as in 20 , but rather we should consider an asymmetric matrix following 21 .Since the interacting individuals are stratified into low-and high-risk groups and we also model the screening process, the model structure of previously considered models is violated.
In reality, patients differ in their risk of becoming colonised during a hospital stay.The risk of colonisation depends on a patient's diagnosis, since it determines their comorbidities and antibiotic use, as well as which hospital ward the patient is admitted to, how frequent their contacts with healthcare workers are, and how long they stay in a hospital.In addition, the diagnoses and comorbidities influence how often a patient is readmitted to a hospital.Patients' diagnoses are stored in hospital records as ICD-10 codes and can be used for risk stratification.This was first done in 26 , where the authors introduced a risk-stratified transmission model within hospitals.Heterogeneity in the risk of colonisation likely influences the transmission dynamics of resistant

Single hospital-community pair model taking into account patient risk groups
The spread of bacteria within a single hospital-community pair can be modelled by a modified version of a susceptible-infectious-susceptible (SIS) model, considered e.g. in Piotrowska et al. 17 .To distinguish two patient risk groups, we introduce additional variables to the model, indexed by i ∈ {1, 2} .Here i = 1 denotes the low-risk group and i = 2 the high-risk group.
For each patient risk group i, we define the following variables as fractions of the total population: susceptible individuals in the hospital S i , colonised individuals in the hospital I i , susceptible individuals in the community V i , and colonised individuals in the community W i , and thus 2 i=1 (S i + I i + V i + W i ) = 1 .We assume that pathogen transmission can occur only in the hospital (c.f. 17,18), while the clearance of colonisation takes place in both the hospital and the community.Susceptible individuals (in the hospital) of group i are exposed equally to colonised individuals of both risk groups.Moreover, following 26 , we assign patients to risk groups according to their medical history, so that they do not migrate between risk groups, for details see section "Patient stratification and parameter estimation".
Thus, we assume that model parameters depend on risk group i, and as a consequence parameter β i > 0 denotes the susceptibility-based transmission rate for i-th risk group, while the corresponding clearance rate is denoted by γ i > 0.
Patients are discharged from the hospital at rate α i and readmitted at rate ε i , where α i , ε i ∈ (0, 1].Further- more, we assume that patients are screened at admission, and if they are found positive, they are decolonised with probability 0 ≤ σ < 1 and then enter the hospital as susceptible.
The process considered above can be described by the following system of ODEs: for i = 1, 2, where S = S 1 + S 2 , I = I 1 + I 2 .Terms β i I I+S S i and γ i I i , γ i W i represent the processes of colonisation of susceptible patients and decolonisation (recovery) of colonised patients, respectively.Terms α i S i , α i I i describe discharge of susceptible and colonised patients, and terms ε i V i , ε i W i describe admission of susceptible and colo- nised individuals from the community, respectively.Term σ ε i W i describes screening and further decolonisation of colonised individuals at the admission, while the term (1 − σ )ε i W i describes admission of colonised patients who are not successfully decolonised yet.
To simplify the notation, we write where G 1 and G 2 are constants.As all the variables denote fractions of the population, we have In addition, the fractions of individuals from the i-th group in the hospital and the community are denoted by respectively.To close system (1), we assume that initial conditions satisfy:

Mathematical properties of the single hospital-community pair model with patient risk groups
First, we focus on the basic properties of the solutions of the considered system. (1a) Statement 1 System (1)-( 2) has global and unique solutions which are non-negative and bounded from above by 1.
Proof Since the right-hand side of system (1) is continuous with respect to t and locally Lipschitz continuous with respect to S i , I i , V i , W i , system (1)-( 2) has local and unique solutions as a direct consequence of Picard- Lindelöf theorem.
Let us observe that for t ≥ 0 so since S(t) + I(t) = H 1 (t) + H 2 (t) , the right-hand side of (1) is a smooth function on the interval of exist- ence of solutions.
In order to prove the non-negativity of the solutions, we recall the Taylor formula.Let g ∈ C n be a function defined on an interval [0, t 1 ) such that g(0) ≥ 0 .By 0 ≤ t 0 < t 1 we denote the first time point at which g is equal to 0. Then where m ≤ n − 1 and lim t→t 0 R m (t,t 0 ) (t−t 0 ) m = 0 .Let k be the index of the first non-zero derivative of g at point t 0 .If dt k is positive, then there exists δ > 0 such that for all t ∈ (t 0 , t 0 + δ) we have Then we can repeat this proof for any subsequent roots of g.
Consider system (1)-( 2).Let t 0 ≥ 0 denote the first time point at which any of the variables S i (t) , I i (t) , V i (t) , W i (t) , i = 1, 2 (possibly more than one) is equal to 0 while the remaining variables are positive.
If S i (t 0 ) = 0 , then from (3a) we have If W i (t 0 ) = 0 then the sign of dW i (t 0 ) dt is the same as the sign of I i (t 0 ) .For I i (t 0 ) = 0 , conditions W i (t 0 ) > 0 or and dI i (t 0 ) dt both have the same sign as ) are equal to 0, then I 1 (0) , I 2 (0) , W 1 (0) , W 2 (0) are also all equal to 0. Otherwise, the solution of (1)-( 2) would intersect with a solution ( S 1 , I 1 , V 1 , W 1 , S 2 , I 2 , V 2 , W 2 ) describing a population without any colonised patients, i.e. for i = 1, 2 with initial conditions S i (0) = S i (t 0 ) , V i (0) = V i (t 0 ) , i = 1, 2 .Since we proved that the solutions are unique, such a situation cannot take place.If I 1 (0) , I 2 (0) , W 1 (0) , W 2 (0) are all equal to 0, then the solutions of system (1)-( 2) are non-negative from (3), since Thus, S i (t), I i (t), V i (t), W i (t) ∈ [0, 1] for every t from the interval of existence.From this observation and the fact that S(t) + I(t) > 0 , it is trivial to check that the right-hand side of system (1) is bounded.Hence, solutions of system (1)-( 2) are bounded functions with bounded first-order derivatives and the solutions can be extended globally.

Existence of steady states
Let the upper asterisk denote the values of the variables at a steady state.Direct calculations of steady states of (1) lead to the following dependencies Moreover, we have since Thus, the disease-free steady state of (1) is given by and it exists for all values of the parameters.
Let us consider steady states of (1), for which I * 1 , I * 2 > 0 .Using W * i from ( 5) and which is equivalent to where describes the probability that an individual from the i-th group, colonised on the discharge, remains colonised on the readmission.Furthermore, we denote Parameter ψ i is a product of two factors.The first one, β i α i +γ i , indicates the number of secondary cases one infectious individual would cause during their stay in the hospital, if all individuals belonged to the i-th group.The second factor describes the sum of probabilities that a patient from that group remains colonised upon any subsequent admission to the hospital.Parameter ψ i is simply a basic reproduction number for a model describing a single risk group i, which is equivalent to the model presented in 17 .
Since S i = H i − I i , from (8) we derive In order to find explicit formulae for I * 1 , I * 2 , we transform (11)-( 12), obtaining where (5) where H * i and ψ i are given by ( 6) and ( 10), respectively.
Proof The existence of the disease-free steady state is obvious.By (14), we find that Thus, any non-negative endemic steady state of (1) is given by real solutions I * 1 to Eq. ( 13) satisfying If ψ 2 > ψ 1 (i.e.A > 0 ), then Eq. ( 13) is quadratic.We find that A > 0 and C ≥ 0 implies B > 0. Hence, by Vieta's formulae, Eq. ( 13) has a single positive real solution if and only if C < 0. Similarly, if ψ 2 = ψ 1 , then we get B > 0 and the positive solution to linear Eq. ( 13) exists only for C < 0. In both cases, it can be easily shown that such solutions satisfy condition (17).Since H * i > 0, inserting formula (15) into condition C < 0 yields On the other hand, when ψ 1 > ψ 2 , due to the symmetry of a problem, it is sufficient to consider polynomial analogous to (13), but with respect to I * 2 instead.It would be equivalent to swapping indices and the considered equation would also have a single positive real solution, which would satisfy condition analogous to (17).
Concluding, if (16) holds, then system (1) has a single endemic steady state.Otherwise, it has no non-negative steady states other than the disease-free steady state x 0 .
Remark 3 Theorem 2 illustrates the impact of the introduction of two risk groups interacting with each other.If these groups were separate, the endemic state would be present independently in these groups, provided that ψ i > 1 for the given group (using results of 17 ).If both groups are present in the same hospital and they interact with each other, their individual basic reproduction numbers ψ i lose their previous interpretations.Instead, the endemic state of the system depends on the sum of these numbers with weights, i.e.
These weights are simply proportions of the given group population to the total population of the hospital.From the proof, we also conclude that there cannot be an endemic state within one group, with a disease-free state in the other group, as either I * 1 = I * 2 = 0 , or I * 1 > 0 and I * 2 > 0.

Basic reproduction number and stability of steady states
To analyse the local stability of steady states of system (1), we use the next generation matrix approach proposed by Diekmann et al. 27 and van den Driessche and Watmough 28 and derive the formula for the so-called basic reproduction number R 0 .Using V i = G i − S i − I i − W i , we reduce and rewrite system (1) as where (15) (16) The next generation matrix of system (1) is given by FV −1 (x 0 ), while the basic reproduction number R 0 -by the spectral radius of this matrix.
For system (18), matrices F and V have the following form: where Direct calculations of the next generation matrix yield where Thus, FV −1 (x 0 ) has only one non-zero eigenvalue and we have ( 19) www.nature.com/scientificreports/where H * i and ψ i are given by ( 6) and ( 10), respectively.Clearly, R 0 is a weighted arithmetic mean of the repro- duction numbers for the model of disjoint risk groups (10) with weights given by ( 6) (see also Remark 3).
Let us recall that if R 0 > 1, then by Theorem 2 system (1) has an endemic steady state.
Theorem 4 Consider R 0 and E 0 given by ( 23) and (7), respectively.Then 1.For R 0 < 1 system (1) has exactly one non-negative globally asymptotically stable steady state E 0 (called the disease-free); 2. For R 0 > 1 system (1) has two non-negative steady states: E 0 , which is unstable, and endemic steady state E * , which is globally asymptotically stable.3.For R 0 = 1 we observe a forward bifurcation for system (1).
Let us rewrite system (1) into an equivalent form for i = 1, 2 .System (24a, 24b) consists of two cascading subsystems (24a) and (24b).Subsystem (24a) com- prises 2 equations and describes the changes in the total populations of given risk groups in the hospital, while subsystem (24b) includes remaining 4 equations and describes the changes in the colonised populations in both hospital and community.Clearly, solutions of subsystem (24a) do not depend on the variables described by subsystem (24b).First, consider subsystem (24a).From (3) it follows that this subsystem has a globally asymptotically stable steady state given by Next, consider subsystem (24b) at the equilibrium of system (24a), namely is positively invariant with respect to (26).We prove the non-negativity of the variables using the same method as in the proof of Statement 1.Similarly, let t 0 denote the first time any of the variables I 1 , I 2 , W 1 , W 2 reaches its upper bound in set K. By (6), for W i (t 0 ) = C * i we have and for I i (t 0 ) = H * i we have Furthermore, since system ( 26) is cooperative and its Jacobian matrix is irreducible in the interior of K, it generates a monotone flow on K and a strongly monotone flow on interior of K, giving a strongly monotone flow on K as a result ( 29 , Theorem 1.7).It is easy to verify that the Jacobian matrix of system ( 26) is also antimonotone.By 29 , Theorem 6.1 restricted to the set K instead of R n + , we obtain that either all solutions tend to 0 (corresponding to the case R 0 < 1 ), or they tend to a unique steady state I * 1 , I * 2 , W * 1 , W * 2 with all coordinates positive (corresponding to the case R 0 > 1).
If R 0 < 1 , then the attractor 0 of system ( 26) is stable by Theorem 2 in 28 .On the other hand, if R 0 > 1 , then the Jacobian matrix evaluated at Vol:.( 1234567890) belongs to the kernel of matrix (Metzler matrix), so, by Perron-Frobenius Theorem, 0 is a simple eigenvalue of (Metzler matrix) and the remaining eigenvalues have negative real parts.Furthermore, there exists a > 0 such that after adding the matrixa • Id , both Jacobian matrix evaluated at and matrix (Metzler matrix) are non-negative and irreducible.By Corollary 2.1.5 30 ,all eigen- values of the Jacobian matrix evaluated at have negative real parts.Thus, for R 0 > 1 steady state 26) is stable.Let us define a point P as a combination of stable steady states of subsystems (24a) and (26)   For R 0 < 1 the point P corresponds to the disease-free steady state E 0 given by ( 7), while for R 0 > 1 it corre- sponds to the endemic steady state E * , where values I * i , W * i satisfy ( 5), ( 13) and (14).In order to prove the global stability of P as a steady state of ( 24), we use arguments inspired by proof of 31 , Theorem 4.2.
First, by 32 , Theorem 2 or 33 , Theorem 3.1 P is locally asymptotically stable steady state of (24).Thus, we only need to prove the global attractivity of the point P.
Consider a trajectory of system (24) starting at any point satisfying the initial conditions (2).The non-negativity of solutions (see Statement 1) indicates that, for all positive t the following conditions are satisfied Since (H * 1 , H * 2 ) is a globally asymptotically stable steady state of (24a), for every ε > 0 there exists T > 0 such that after time T the considered trajectory of system (24) is contained in set Therefore, the ω-limit set of this trajectory is a subset of Assume that in the ω-limit set of the considered trajectory there is a point From the local asymptotic stability of P there exists such δ > 0 that solu- tions starting at any point in δ-neighbourhood of P do not leave ε-neighbourhood of P.
Observe that the last four variables of the solution to system (24) with initial point P 1 are equal to the solution of system (26) with an initial point , we obtain that the trajectory starting at P 1 converges to P. In particular, for any 0 < η < δ there exists T > 0 such that after time T solution starting at P 1 does not leave η-neighbourhood of P.
From the continuity of solutions with respect to initial conditions, there exists a small enough neighbourhood of P 1 such that after time T all the trajectories starting at this neighbourhood belong to δ-neighbourhood of P and, thus, these trajectories do not leave the ε-neighbourhood of P. Hence, P 1 cannot belong to the ω-limit set of the initially chosen trajectory and P is the only element in this ω-limit set.
Since system (24) and system (1) are equivalent, for R 0 < 1 the state E 0 is globally asymptotically stable and for R 0 > 1 the endemic steady state E * is globally asymptotically stable.As a consequence, for R 0 = 1 we observe a forward bifurcation.Alternatively, to investigate the type of the bifurcation at R 0 = 1 one can follow the approach proposed by van den Driessche and Watmough and check the assumptions of Theorem 4 from 28 ; for details see Statement 5 and its proof.

Statement 5
The following statements are true: 1.For R 0 < 1 system (18) has exactly one non-negative locally asymptotically stable steady state x 0 (called the disease-free); 2. For R 0 > 1 system (18) has two non-negative steady states: x 0 , which is unstable, and endemic steady state x * .Moreover, there exists ε > 0 such that x * is locally asymptotically stable for R 0 satisfying 1 + ε > R 0 > 1 ; 3. For R 0 = 1 we observe a forward bifurcation for system (18), where R 0 and x 0 are given by ( 23) and (21), respectively.Proof Since system (18) satisfies conditions (A1)-(A5) postulated in 28 , the statements regarding the stability of disease-free steady state x 0 are the direct consequences of Theorem 2 in 28 .Namely, x 0 is locally asymptotically stable for R 0 < 1 and unstable for R 0 > 1.
In order to investigate the stability of the endemic steady state and also the type of the bifurcation occurring at R 0 = 1, we use an approach based on the centre manifold theory, as proposed in 28 .Let us define a bifurcation parameter µ = β 2 − β, where Vol Below we show that there exists a δ > 0 such that there is a locally asymptotically stable endemic steady state near the disease-free steady state for δ > µ > 0 .To do so, we follow van den Driessche and Watmough and check the assumptions of Theorem 4 in 28 , i.e. we verify that a < 0 , b = 0 , for a and b given by where v and w are left and right null-vectors of D x f (x 0 , β) .Furthermore, we check if zero is a simple eigenvalue of D x f (x 0 , β) i.e. the Jacobian matrix of the system (18) evaluated at the point (x 0 , β) .Direct calculations lead to where, for i = 1, 2, Clearly, for x = x 0 we have It is clear that at least two eigenvalues of D x f (x 0 , β) are non-zero, thus it is enough to consider 4 × 4 upper left block of the matrix.Calculating the coefficients of that block's characteristic polynomial directly, it can be shown that the constant term is equal to 0, while the linear-term coefficient is non-zero.Hence, the zero eigenvalue of In addition, the only non-zero second-order derivatives of the right-hand side of ( 18) are equal to the firstorder derivatives of d I i , d S i , d H i with respect to x 1 , . . ., x 6 : www.nature.com/scientificreports/For x = x 0 we have x 1 , x 2 = 0, and x 4+i = H * i , hence evaluating those derivatives at point (x 0 , β) we get Lemma 3 in 28 indicates that v 5 = v 6 = 0 .Thus, formula (29) for a can be rewritten as where Having calculated the required derivatives at (x 0 , β), we can factor out (H * 1 + H * 2 ) −2 and write down the terms of a 1 : Analogously, for a 2 we obtain Direct calculations of right eigenvector of D x f (x 0 , β) show that w 1 = −w 5 and w 2 = −w 6 , implying that Vol Clearly if v 1 , v 2 , w 1 , w 2 > 0, then the expression (35) is negative.From Lemma 3 of 28 we have v 1 , v 2 , w 1 , w 2 ≥ 0.Moreover, direct calculations show that v 1 w 1 , v 2 w 2 > 0, yielding a < 0. On the other hand, we find that the only second-order derivatives that appear in (30) that are non-zero are However, evaluating them at (x 0 , β) gives us Thus, We showed above that v 2 > 0 and w 1 = −w 2 , so b = 0 , which completes the proof.
In conclusion, we have shown that for R 0 < 1 , independently of the initial conditions, all the solutions of system (18) converge to the disease-free steady state, meaning that the prevalence of colonisation, defined as the size of colonised population divided by the size of total population, fades over time.On the other hand, for R 0 > 1 there is an endemic steady state, with a constant prevalence of colonisation over time.In this case, all solutions with non-zero initial colonised population converge to the endemic steady state.

Numerical simulations
To illustrate the pathogen spread in hospital-community pairs, we perform numerical simulations using the proposed model.First, we describe the dataset used to estimate the parameters of system (1).It allows us to estimate transfer parameters, and to compute various quantities characterising each hospital-community pair j, such as its basic reproduction number R j 0 or the average percentage of high-risk patients in the hospital.Next, we investigate how the pathogen spread is influenced by particular risk groups.Finally, we illustrate the impact of interventions on the basic reproduction numbers as well as the bacteria prevalence.

Dataset description
The dataset was provided by AOK Lower Saxony (AOK LS), a German health insurance company.It consists of all hospitalisation records of patients insured by AOK LS between January 1st, 2008 and December 31st, 2015.Each record contains the following information: patient's anonymised ID, birth year, sex, dates of admission and discharge, medical diagnosis codes (ICD-10 codes), anonymised ID of the healthcare facility where the patient has been admitted, and the code of state where the facility is located.
The AOK LS dataset contains 5 254 492 records in total, out of which 4 573 584 are from the facilities located in Lower Saxony.Since we do not have representative records from the facilities located in other states (due to the low coverage), we do not consider these data in our further analysis.
According to the data, there are 223 healthcare facilities in Lower Saxony.Among these, 60 have been inactive for at least 90 consecutive days, with no ongoing hospitalisations during that time span, hence we omit records from these facilities.As a consequence, we consider 163 facilities with 4 223 397 hospitalisations of 1 482 176 distinct patients.However, after removing hospitalisation records from timely inactive facilities, there are 62 313 patients with no hospitalisation records at all.These patients are excluded from further analysis as they do not contribute to any characteristic of any hospital nor community.Average yearly numbers of admissions vary between facilities, from 46.75 to 16343.38 hospitalisations.Further analysis of this dataset may be found in 34 .
Otherwise, we assign them to the low-risk group.The assignment of patients to specific groups based on ICD-10 diagnoses is related to the assumption that their immune systems can be compromised, or to their longer and more frequent stays in the hospitals.
In the considered dataset, we identify 226 607 high-risk patients and 1 193 256 low-risk patients.High-risk patients are generally older (see Fig. 1) and on average they have 5.18 hospitalisations during the considered  for high-risk and low-risk groups with respect to the total population of the risk groups.We see that there are proportionally more high-risk patients for almost all numbers of hospitalisations greater than 2. period, while low-risk patients have 2.55 hospitalisations.For both groups, the majority of patients (low-risk: 98.36%; high-risk: 90.08%) have not been hospitalised more than 10 times, c.f. Fig. 2.
For each patient, we compute the average length of the hospitalisation based on their records.The results are presented in Fig. 3.By taking the average of this value over the respective risk groups, we conclude that a high-risk patient spends on average 10.56 days in a hospital during a single hospitalisation, while a low-risk patient -7.56 days.For individual patients from the low-risk group, we observe a peak in the distribution of average length of hospitalisation between days 2 and 5, while for the high-risk patients, the peak lies between days 6 and 9, as presented in Fig. 3. On average, high-risk patients visit 1.93 facilities and low-risk patients -1.43 facilities (c.f.Fig. 4).
Using hospitalisation records, sorted by the admission date, we track the hospital and community stays of each patient.Every hospitalisation record is interpreted as a stay in a given hospital for a given number of days.The period between the date of discharge of a hospitalisation and the date of admission of the subsequent hospitalisation (including the discharge and admission dates) is interpreted as a stay in the community corresponding to For survival curves, we normalise the number of patients by dividing it by the total population of the respective risk group.We can see that for values not greater than 25 days (indicated by the dashed line) there are proportionally more low-risk patients with an average hospitalisation length shorter than the given value.However, for values greater than 25 days, the situation is the opposite.This means that low-risk patients often have either very short or very long average lengths of hospital stay.www.nature.com/scientificreports/ the former (i.e.most recently visited) hospital.Hence, each patient is considered to stay outside of the considered hospitals and communities before their first hospitalisation record and after their last hospitalisation record.
For each hospital-community pair and each risk group, we calculate the average length of stay in the hospital and in the corresponding community and denote them as (LOH j i ) and (LOC j i ) , respectively, where i ∈ {1, 2} indicates the risk group (1 -low-risk group, 2 -high-risk group) and j ∈ {1, . . ., 163} -the considered hospital- community pair.Let us emphasise that we consider a system of separated hospital-community pairs rather than an interconnected network model.Thus, we do not track patient transfers between such pairs.We also characterise hospital-community pairs by average pair size PS j , i.e. the average daily number of patients present in the pair j (in either hospital j or community j), according to the previously described rules.
The sum of all average pair sizes, denoted by N, is the total population.Furthermore, for each hospital, we compute p j HR , which stands for the average proportion of high-risk patients in the hospital for each day with a non-zero total hospital population.The results, presented in Fig. 5, vary substantially across different hospitals, as they range between 0.002 and 0.994, with the majority lying between 0.2 and 0.5 and the average being 0.318.
Next, we estimate the parameters of the model.Parameters α j i and ε j i of system (1) describing the discharge and admission rates, respectively, are approximated as see Fig. 6a,b.The values of parameters γ j i = γ 1 = γ 2 are taken from 17,18,35 (c.f. ( 36)).As the base values for the transmission parameters, we take β j 1 = β 1 and β j 2 = β 2 = 2β 1 , since we assume that in all hospitals the trans- mission risk is the same and that high-risk patients are more vulnerable to susceptibility-based transmission.Moreover, β 1 is selected in such a way that the average bacteria prevalence in the community, defined as the sum of the percentages of the colonised population in the community in the endemic steady state of each hospitalcommunity pair, multiplied by weights representing the ratio of average pair size to the total population, that is is close to 8.6%, i.e. the prevalence of carriage of ESBL-producing Enterobacteriaceae in a representative sample of the general adult Dutch society reported by Reuland et al. 36 .Thus, as a base value for the simulations, we use The values of transmission rates β 1 , β 2 can be additionally impacted by the interventions.In section "Prevalence of multiresistant pathogens", we discuss two such cases.In one of them, only β 2 is affected, with its value set as low as 0.0503 day −1 .In the other case, both transmission rates are decreased by up to 30% of the original values.Parameter values estimated from data for each hospital-community pair j are shown in Fig. 6.Hospitals are sorted in ascending order according to R j 0 value (calculated using formula (39)), c.f. Fig. 7b.For the vast majority of hospital-community pairs the lengths of stay of high-risk patients in the community between the hospitalisations ( LOC j 2 ) are shorter than for the low-risk group ( LOC j 1 ), yielding that ε 2 > ε 1 (159 pairs out of 163 considered).Additionally, for the majority of hospital-community pairs high-risk patients stay longer in hospitals than low-risk patients, thus α 2 < α 1 (131 pairs out of 163), see Fig. 6a,b.

Characterisation of hospital-community pairs
For each hospital-community pair j, we estimate parameters G j 1 and G j 2 .First, we postulate the following relation between the average proportion of high-risk patients in the j-th hospital ( p j HR ) based on the data and theoretical populations in the endemic steady state  23) and ( 37) it follows that for each hospitalcommunity pair j we have where ψ j i is an analogue of (10), easily calculated based on already estimated parameters, see Remark 3 and Fig. 7b.We found that 17 (out of 163 considered) pairs have R j 0 < 1 .However, this number depends on the trans- mission rates and thus it may be specific for a given pathogen and individual situation in the hospital.The latter is not taken into account in our simulations, as we assume that the transmission rates are the same in all hospitals.
In the following, we would also require the values of H * ,j i and C * ,j i , which one can estimate as using previously estimated parameters.
In Fig. 8a, we report βj values defined in (27) for each hospital-community pair j.As described in Sec- tion "Mathematical properties of the single hospital-community pair model with patient risk groups", βj is the critical value of the transmission parameter for the high-risk group for which we observe forward bifurcation in the system, assuming other parameters to be fixed.For 25 out of 163 pairs, the bifurcation occurs in the biologically non-feasible parameter region.In such cases the computed βj is negative, which follows from the fact that in these cases we have c.f. Eq. ( 27) for β.Thus, for the parameters estimated from the data, the bifurcation cannot occur for these pairs, no matter what the value of β 2 is.In such a case the disease-free steady state is always unstable and there always exists the endemic steady state.
In Fig. 8b, we report the average yearly number of admissions to each of 163 hospitals and classify the pairs according to ψ j i values.Clearly, condition ψ 2 ≥ ψ 1 holds for all hospital-community pairs.

Prevalence of multiresistant pathogens
Using the open-source SciPy library 37 , we perform numerical simulations of system (1) with parameters estimated for each of the separate 163 hospital-community pairs as presented above.Assuming that at the initial time ( t = 0 ) there are 8.6% colonised patients in communities (c.f. 36) and 17.2% , i.e. twice as many colonised patients in hospitals, for each simulation we set the following initial condition: (37) , where H * ,j i , C * ,j i are calculated according to (40). Figure 9a-f shows how the percentage of colonised patients changes over 3 000 days for each of the separate hospital-community pairs.In Fig. 9a,b, we illustrate the changes in bacteria prevalence within the whole population, while in Fig. 9c-f-for low-risk and high-risk groups, respectively.Plots in the left column represent the changes of the prevalence in the hospitals, whereas plots in the right column -changes in the communities.
In each figure, we observe a clear pattern that for considered hospital-community pairs with R 0 < 1 bacte- ria prevalence fades over time, while for pairs with R 0 > 1 it stabilises on some non-zero level.In addition, we observe that for the majority of hospital-community pairs with R 0 > 1, the prevalence in the high-risk group is much higher than in the low-risk group, see also Fig. 10a-f, where the point prevalence (at day 3000) in hospitals and communities for the cases presented in Fig. 9a-f is reported.
In Fig. 11, we present the solutions of system (18) over 10 000 days for two specific hospital-community pairs (pair number 12, where R 12 0 ≈ 0.866 , and pair number 100, where R 101 0 ≈ 1.364 ).Depending on whether R 0 is greater or less than 1, we observe different dynamics of the solutions to system (18).For R 0 < 1, solutions converge to the disease-free steady state, while for R 0 > 1, they converge to the endemic steady state.This agrees with the analytical results presented in Section "Mathematical properties of the single hospital-community pair model with patient risk groups".
In Tables 1 and 2, we present the sample Pearson correlation coefficients weighted by pair size for hospital (or community) prevalence and different characteristics of hospital-community pairs calculated using the following formula: where and x and y are the investigated variables represented by vectors, m w (x, w) is the weighted mean of vector x with weights vector w reporting the pair sizes.
Clearly, characteristics R 0 , ψ 1 and ψ 2 are strongly correlated with the prevalences, see Table 1, however, ψ 1 is less strongly correlated compared to R 0 and ψ 2 .In all cases, except parameters α i , correlations are positive.Table 1.Pearson weighted correlation coefficients for the hospital (or community) prevalence and: basic reproduction number R 0 , basic reproduction number of a given risk group ψ i , discharge rates α i and admission rates ε i .The weights correspond to the pair sizes.Hosp.prev.general stands for the fraction of colonised individuals in the hospital; Hosp.prev.LR-the fraction of colonised low-risk individuals in the hospital; Hosp.prev.HR -the fraction of colonised high-risk individuals in the hospital., the average proportion of high-risk patients in a hospital p HR and the total population of the high-risk group G 2 .The weights correspond to the pair sizes.Hosp.prev.general stands for the percent of colonised individuals in the hospital; Hosp.prev.LR -the percent of colonised low-risk individuals in the hospital; Hosp.prev.HR -the percent of colonised high-risk individuals in the hospital.The strongest correlation is observed between the basic reproduction numbers and the percent of colonised individuals in the hospitals (hosp.prev.general) and between the basic reproduction numbers and the percent of colonised individuals in communities (comm.prev.general), as well as between the basic reproduction numbers and the prevalences limited to particular risk groups.Thus, in both risk groups R 0 can be expected to be a better predictor of bacteria prevalence than the group's ψ i .We also note the fact that in terms of the absolute value, prevalences are more strongly correlated with ε i than with α i .Pair sizes, p HR , G 2 , and C * 2 reported in Table 2 do not correlate strongly with the prevalences.On the other hand, H * 1 and H * 2 are strongly positively correlated with the percentages of colonised individuals in both hospital and the community, whereas C * 1 is the only one correlated negatively with prevalences.In conclusion, the simulations demonstrate that quantitatively different cases are present in the regional healthcare system for Lower Saxony, under the assumption that inter-hospital ties are neglected.Depending on the basic reproduction number, the disease either eventually vanishes ( R 0 < 1 ), or it becomes endemic ( R 0 > 1 ).Since R 0 is derived from hospital admission/discharge statistics and pathogen transmission/recovery rates, it may be used to estimate the susceptibility of individual hospital-community pairs for a given pathogen.

Interventions
In order to evaluate the efficiency of prevention strategies, let us first consider the relationship between basic reproduction number R 0 and transmission rates for risk groups β 1 , β 2 .Clearly, from Eqs. ( 23), ( 10), ( 6) it follows that we have linear relationship between R 0 and β i ( i = 1, 2) > 0 is independent of both β 1 and β 2 for i = 1, 2 .Thus, the basic reproduction number is constant on lines l = K 1 β 1 + K 2 β 2 for l > 0.
The line R 0 = 1 , combined with lines β 1 = 0 and β 2 = 0 are boundaries of a triangle T within which R 0 < 1 and, consequently, disease-free steady state is globally asymptotically stable, see Fig. 12a.The hospital-level interventions are represented as the changes of one or both transmission rates.A successful eradicating intervention transforms a point from the area where R 0 > 1 to the area where R 0 < 1 (i.e. the triangle T).As presented in Fig. 12b, the intervention can impact only the first or only the second risk group (arrows A and C, respectively) or both of them at the same time (arrow B).In order to perform a successful eradicating intervention, the line between the initial and the final state must intersect the interior of the red triangle (Fig. 12b).In particular, it means that if b i ≥ βi then any intervention that influences only transmission rate β 3−i cannot successfully eradi- cate the bacteria.An example of such a situation for i = 1 is presented as point (b 1 , b 2 ) in Fig. 12b, i.e. it shows a case in which any intervention targeted to reducing the transmission among the high-risk patients only is insufficient for the complete bacteria eradication.There is an open interval of angles for which the arrow can transport the initial point to triangle T and it is given by π . So it is possible, that in some cases one of the transmission rates increases, but, nonetheless, the basic reproduction number will get less than 1.Thus, from a theoretical perspective, there can exist interventions that successfully eradicate bacteria, despite increasing transmission rate among one risk group.
One can also consider the introduction of more successful patient screening on admission (parameter σ impacted).The effects of all proposed types of interventions are shown in Fig. 13, where the decrease in bacteria prevalence in the respective hospital-community pairs is presented.These three types of interventions can be scaled to have similar effectiveness.The results of different interventions in the same hospital-community pair are comparable in the pairs with pair indices less than 135.Nevertheless, the differences are more pronounced for the remaining pairs, which are characterised by the smaller average pair size PS j or the unusually high or low proportion of the high-risk patients p j HR , c.f. Fig. 5. Therefore, to choose the most appropriate course of the   c,d) reducing transmission rate in the high-risk group and thus decreasing parameter β 2 ; (e,f) reducing transmission rate population-wide and thus decreasing parameters β 1 and β 2 at the same rate.Prevalence is expressed as the percentage of I j 1 + I j 2 among the whole hospital population and the percentage of W j 1 + W j 2 among the total community node population, respectively, after 3000 days from the start of the simulation, for each of the separate hospital-community pairs ( j = 1, . . ., 163) , calculated from the solutions to system (1) with initial condition (41).Hospital-community pairs are sorted in the same order as in 9.The size of markers is proportional to the average pair size PS j .intervention, a more advanced, optimisation-based decision process is needed, which would also take the costs of the intervention into consideration.
On the other hand, Fig. 14 shows how the R j 0 can change in comparison with the original value when recalculated after the considered interventions.Clearly, the most extreme interventions result in the largest reduction of basic reproduction numbers.We observe the same pattern as in Fig. 5, namely two distinct groups of hospital-community pairs: one consisting of pairs indexed from 1 to around 135, where we see a similar response to the interventions, and the other with a large diversity in results.Interestingly, Fig. 14b indicates that interventions focusing on the high-risk group applied to the hospitals with a low fraction of such patients may not achieve a satisfactory effect.This is not the case for the interventions focused on raising the effectiveness of screening or interventions concerning both risk groups simultaneously, where the effects are similar for all hospital-community pairs.

Discussion and summary
Model (1) is intended for simulations of hospital-acquired infection dynamics in a hospital and a community of patients coupled with this hospital.The model extends a previously published model 17 by stratification into low-and high-risk patients.Depending on the basic reproduction number, two scenarios are possible: either it is low enough for there to be only a disease-free steady state, or it is high enough to indicate the simultaneous existence of an endemic steady state.Moreover, the basic reproduction number for model (1) is simply a convex combination of basic reproduction numbers of 17 for decoupled risk groups.The mathematical analysis indicates that it is not possible to attain a mixed steady state, in which the endemic state is present in only one group (Remark 3): either it is disease-free, or endemic simultaneously in both groups.The mixed steady state would be only possible if these groups would be separated from each other.
Heterogeneity in prevalence for simulation results in different healthcare facilities is observed for both this and previous 17 model, and for both models, it is correlated with respective basic reproduction numbers.In addition, as expected, the simulations indicate that the prevalence in the high-risk group is generally higher than in the low-risk group.On the other hand, prevalence results for communities coupled with those facilities are not representative of the communities as a whole, since during simulations only a small community subset, present in the hospital records, was considered.
Thus, the theoretical analysis, as well as simulation results, indicate that the division into low/high-risk groups does not lead to qualitatively new dynamics at the population level, whereas quantitative behaviour depends on the exact parameter values.However, new light is shed on how the pathogen transfer dynamics affects the risk groups.
The strength of model (1) lies in the capability to simulate interventions addressed to a particular risk group.For the model without risk groups, targeted interventions lead to a substantial decrease in the prevalence 19 .However, the problem to overcome is the cost of such interventions.Simulations with model (1) demonstrate how division into risk groups may lower these costs.As demonstrated in Fig. 13a,b, a great reduction of the prevalence could be achieved by screening and decolonising patients on admission.But this is hardly a viable option, as such decolonisation or perfect isolation of positively-tested patients is not possible in practice, not to mention the additional burden due to intensive initial testing.However, it is a reference point.By introducing increased preventive countermeasures aimed at high-risk patients alone it is possible to obtain similar effects as through the process of screening (c.f.Fig. 13c,d).Moreover, the high-risk group is smaller than the low-risk group in most healthcare facilities, so the interventions would be applied to only a fraction of the patients.It introducing patient screening and thus increasing parameter σ ; (b) reducing transmission rate in the high-risk group and thus decreasing parameter β 2 ; (c) reducing transmission rate population-wide and thus decreasing parameters β 1 and β 2 at the same rate.Hospital-community pairs are sorted in the same order as in 9.Each vertical dashed line indicates the first pair for which recalculated R j 0 > 1 -the greater the intervention, the closer the line is to the right.The size of markers is proportional to the average pair size PS j .must be noted that despite the satisfactory reduction in the prevalence, these interventions do not guarantee a switch from the endemic state regime to the disease-free regime, as this may require a substantial reduction in transmission parameters, possibly in both risk groups (see Section "Prevalence of multiresistant pathogens").
Some limitations of this model come from the nature of the SIS-type ODE systems, as it is assumed that all the considered populations are homogeneous within themselves and well-mixed.Additionally, when it comes to the bacteria transmission process, we assume homogeneous mixing between the risk groups in hospitals.
As mentioned before, the simulation results might not realistically depict the transmission dynamics within the entire community outside of hospitals, as the considered dataset only accounts for individuals who visited a hospital at least once during the eight-year period.Furthermore, the results presented in this study are based on hospitals decoupled from each other.In further work, it would be beneficial to extend the model to introduce direct and indirect patient transfers between hospitals and to simulate targeted interventions within one or both risk groups.In particular, it is important to determine if the interventions based on the risk groups would be more effective than the interventions ignoring them.

Figure 1 .
Figure 1.Birth year structure of the AOK LS patients hospitalised in years 2008-2015, for high-risk and lowrisk groups.

Figure 2 .
Figure 2. (a) Number of AOK LS patients with a given number of admissions/hospitalisations for high-risk and low-risk groups.(b) Proportion of AOK LS patients with a given number of admissions/hospitalisationsfor high-risk and low-risk groups with respect to the total population of the risk groups.We see that there are proportionally more high-risk patients for almost all numbers of hospitalisations greater than 2.

Figure 3 .
Figure3.The average length of stay in a hospital for each patient from high-risk and low-risk groups in AOK LS dataset presented as a histogram (a) and as survival curves truncated at 80 days (b).For survival curves, we normalise the number of patients by dividing it by the total population of the respective risk group.We can see that for values not greater than 25 days (indicated by the dashed line) there are proportionally more low-risk patients with an average hospitalisation length shorter than the given value.However, for values greater than 25 days, the situation is the opposite.This means that low-risk patients often have either very short or very long average lengths of hospital stay.

Figure 4 .
Figure 4. (a) Number of AOK LS patients with a given number of visited hospitals for high-risk and low-risk groups.(b) Proportion of AOK LS patients with a given number of visited hospitals for high-risk and low-risk groups with respect to the total population of the risk group.

) γ 1 Figure 5 .
Figure 5. Proportion of high-risk and low-risk patients in each hospital averaged over time.Hospitals are sorted in ascending order based on R j 0 value, calculated using (39).

Figure 6 .and α j 2 ;and ε j 2 .Figure 7 . 2 ,
Figure 6.(a) Discharge parameters α j 1 and α j 2 ; (b) admission parameters ε j 1 and ε j 2 .Pairs are sorted in ascending order based on R j 0 value.Vertical dashed line indicates the first pair for which R j 0 > 1.The size of markers is proportional to the average pair size PS j .

Figure 8 . 2 ≥ ψ j 1 .
Figure 8.(a) βj value evaluated for the fixed model parameters as described in Section "Patient stratification and parameter estimation" (with omitted values: 2.58, −21.48, −6.45), biologically non-feasible values marked in red; (b) average yearly number of admissions, for each hospital-community pair ( j = 1, . . ., 163 ).There are no red markers in (b), as for all pairs ψ j 2 ≥ ψ j 1 .Pairs are sorted in ascending order based on R j 0 value.Vertical dashed line indicates the first pair for which R j 0 > 1.The size of markers is proportional to the average pair size PS j .

Figure 9 . 2
Figure 9. Bacteria prevalence in the hospitals (left column) and communities (right column), for (a,b) both risk groups (expressed as a percentage of I j 1 + I j 2 among the whole hospital population and percentage of W j 1 + W j 2 among the whole community population, respectively); (c,d) low-risk group (expressed as the percentage of I j 1 among the hospital population from low-risk group and percentage of W j 1 among the community population from high-risk group); (e,f) high-risk group (expressed as the percentage of I j 2 among the hospital population from high-risk group and percentage of W j 2 among the community population from high-risk group), over 3 000 days for each of the separate hospital-community pairs ( j = 1, . . ., 163 ), calculated from the solutions to system (1) with initial condition (41).Hospital-community pairs are sorted in ascending order based on R j 0 value.Vertical dashed line indicates the first pair for which R j 0 > 1..

Figure 10 .
Figure 10.Point prevalence in the hospitals (left column) and communities (right column), for (a,b) both risk groups (expressed as a percentage of I j 1 + I j 2 among the whole hospital population and percentage of W j 1 + W j 2

Figure 11 .
Figure 11.Solutions of system (18) with initial condition (41) for two selected hospital-community pairs, in which (a) R j 0 ≈ 0.866; (b) R j 0 ≈ 1.364 .Cross-marks indicate values of variables at the (a) disease-free steady state; (b) endemic steady state calculated analytically as indicated in "Mathematical properties of the single hospital-community pair model with patient risk groups" section.

Figure 12 .
Figure 12.(a) The value of basic reproduction number as a function of risk-based transmission rates; (b) Examples of possible interventions A, B, C for initial point (b 1 , b 2 ) for which R 0 > 1 .The red triangle represents the area that each successful intervention must cross.

Figure 13 .
Figure 13.Decrease in bacteria prevalence in the hospitals (left column) and communities (right column) for both risk groups, measured in percentage points (p.p.), after intervention (a,b) introducing patient screening and thus increasing parameter σ ; (c,d) reducing transmission rate in the high-risk group and thus decreasing parameter β 2 ; (e,f) reducing transmission rate population-wide and thus decreasing parameters β 1 and β 2 at the same rate.Prevalence is expressed as the percentage of I

Figure 14 .
Figure 14.Values of R j 0 calculated according to formula (39) before and after applying the interventions (a)introducing patient screening and thus increasing parameter σ ; (b) reducing transmission rate in the high-risk group and thus decreasing parameter β 2 ; (c) reducing transmission rate population-wide and thus decreasing parameters β 1 and β 2 at the same rate.Hospital-community pairs are sorted in the same order as in 9.Each vertical dashed line indicates the first pair for which recalculated R S 1 , S 2 )

Table 2 .
Pearson weighted correlation coefficients for the hospital (or community) prevalence and: the pair sizes, hospital and community populations of risk groups i in a steady state, respectively H * i and C * i